Statistical Mining in Data Streams
Authors
Ankur Jain
Abstract
Recent years have seen a steady rise of a new class of data management systems called Data Stream Management Systems (DSMS). These systems manage rapid, high-volume data streams with transient relations instead of static data with persistent relations. Data streams are common to applications such as network traffic and transaction monitoring systems, click-stream processors, industrial process control, and sensor networks. A DSMS operates on these continuous and time-varying data streams to facilitate on-the-fly query answering, and to support data acquisition, monitoring and analysis.
In this dissertation, we present statistical stream mining solutions for effective online processing of streaming data. We focus on research issues related to adaptive stream resource conservation and online mining in a DSMS. We have developed statistical linear and non-linear filtering techniques based on the Kalman Filter to capture temporal correlations in the streaming data. Such correlations help in stream resource conservation. We also propose techniques that capture spatial correlations between the streaming sources, which further improve resource conservation and facilitate answering group queries efficiently.
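The abstract does not spell out how filtering leads to resource conservation. One common way to exploit temporal correlation, shown here only as a minimal sketch and not as the dissertation's actual algorithm, is dual prediction: the source and the server run identical Kalman filters, and the source transmits a reading only when the shared prediction misses it by more than a tolerance. The random-walk state model, the noise variances `q` and `r`, and the names `ScalarKalman`, `suppress`, and `tolerance` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """1-D Kalman filter with a random-walk state model (an assumption)."""
    x: float = 0.0   # current state estimate
    p: float = 1.0   # variance of the estimate
    q: float = 1e-3  # process-noise variance (hypothetical value)
    r: float = 1e-2  # measurement-noise variance (hypothetical value)

    def predict(self) -> float:
        # Random walk: the estimate carries over, its uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z: float) -> None:
        # Standard scalar Kalman correction with measurement z.
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)
        self.p *= 1.0 - k


def suppress(readings, tolerance):
    """Simulate source and server; return (server_view, sent_count)."""
    source, server = ScalarKalman(), ScalarKalman()
    server_view, sent = [], 0
    for z in readings:
        predicted = source.predict()
        server_predicted = server.predict()  # same value: filters stay in sync
        if abs(z - predicted) > tolerance:
            # Prediction is too far off: transmit and correct both filters.
            source.update(z)
            server.update(z)
            sent += 1
            server_view.append(z)
        else:
            # Prediction is good enough: nothing is sent.
            server_view.append(server_predicted)
    return server_view, sent
```

By construction, every value in `server_view` is within `tolerance` of the true reading, so a slowly varying stream costs few transmissions while the server still answers queries within a known error bound.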
Similar resources
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that appear frequently in the data set and mark them as frequent patterns, so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Chapter 9 MINING TEXT STREAMS
The large amounts of text data that are continuously produced over time in a variety of large-scale applications, such as social networks, result in massive streams of data. Typically, massive text streams are created by very large-scale interactions of individuals, or by structured creations of particular kinds of content by dedicated organizations. An example in the latter category would be the...
Ratio Rule Mining from Multiple Data Sources
Both multiple source data mining and streaming data mining problems have attracted much attention in the past decade. In contrast to traditional association-rule mining, to capture the quantitative association knowledge, a new paradigm called Ratio Rule (RR) was proposed recently. We extend this framework to mining ratio rules from multiple source data streams which is a novel and challenging p...
Statistical supports for mining sequential patterns and improving the incremental update process on data streams
Recently, the knowledge extraction community has taken a closer look at new models in which data arrive as a fast and continuous flow, i.e. data streams. As only a part of the stream can be stored, mining data streams for sequential patterns and updating previously found frequent patterns need to cope with uncertainty. In this paper, we introduce a new statistical approach which biais...
Adaptive Mining Techniques for Data Streams using Algorithm Output Granularity
Mining data streams is an emerging area of research given the potentially large number of business and scientific applications. A significant challenge in analyzing/mining data streams is the high data rate of the stream. In this paper, we propose a novel approach to cope with the high data rate of incoming data streams. We termed our approach “algorithm output granularity”. It is a resource-aw...
Publication date: 2006